Action Time Sharing Policies for Ergodic Control of Markov Chains

Authors

  • Amarjit Budhiraja
  • Xin Liu
  • Adam Shwartz
Abstract

Ergodic control for discrete time controlled Markov chains with a locally compact state space and a compact action space is considered under suitable stability, irreducibility and Feller continuity conditions. A flexible family of controls, called action time sharing (ATS) policies, associated with a given continuous stationary Markov control, is introduced. It is shown that the long term average cost for such a control policy, for a broad range of one stage cost functions, is the same as that for the associated stationary Markov policy. In addition, ATS policies are well suited for a range of estimation, information collection and adaptive control goals. To illustrate the possibilities we present two examples: The first demonstrates a construction of an ATS policy that leads to consistent estimators for unknown model parameters while producing the desired long term average cost value. The second example considers a setting where the target stationary Markov control q is not known but there are sampling schemes available that allow for consistent estimation of q. We construct an ATS policy which uses dynamic estimators for q for control decisions and show that the associated cost coincides with that for the unknown Markov control q.

AMS 2000 subject classifications: 90C40, 60K15
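
The core claim of the abstract — that deterministically sharing time among actions in the proportions prescribed by a stationary Markov control q yields the same long-run average cost as randomizing according to q — can be illustrated on a toy model. The sketch below uses a hypothetical 2-state, 2-action chain (the numbers are illustrative, not from the paper) and a simple greedy quota rule as one possible time-sharing scheme:

```python
import numpy as np

# Hypothetical 2-state, 2-action model (illustrative numbers, not from the paper).
# P[a][x] is the next-state distribution under action a in state x;
# c[x][a] is the one-stage cost; q[x][a] is the stationary Markov control q(a|x).
P = np.array([[[0.7, 0.3],
               [0.2, 0.8]],
              [[0.4, 0.6],
               [0.9, 0.1]]])
c = np.array([[1.0, 4.0],
              [2.0, 0.5]])
q = np.array([[0.5, 0.5],
              [0.25, 0.75]])

def average_cost(policy, n=100_000, seed=1):
    """Simulate the chain and return the empirical long-run average cost."""
    rng = np.random.default_rng(seed)
    x, total = 0, 0.0
    for _ in range(n):
        a = policy(x, rng)
        total += c[x, a]
        x = rng.choice(2, p=P[a, x])
    return total / n

def markov_policy(x, rng):
    # Stationary randomized Markov control: sample an action from q(. | x).
    return rng.choice(2, p=q[x])

taken = np.zeros((2, 2))  # taken[x, a]: times action a was used in state x
def ats_policy(x, rng):
    # Deterministic time sharing: on each visit to x, pick the action whose
    # empirical frequency lags its target share q(a|x) the most, so that
    # per-state action frequencies converge to q.
    a = int(np.argmax(q[x] * taken[x].sum() - taken[x]))
    taken[x, a] += 1
    return a
```

Running both simulations produces nearly identical empirical average costs, matching the equality of long-term costs stated in the abstract; the quota rule here is only one convenient way to realize the required action frequencies.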

Similar articles

Ergodic Control of a Singularly Perturbed Markov Process in Discrete Time with General State and Compact Action Spaces

Ergodic control of singularly perturbed Markov chains with general state and compact action spaces is considered. A new method is given for characterizing the limit of the invariant measures of the perturbed chains as the perturbation parameter goes to zero. It is also demonstrated that the limit control principle is satisfied under natural ergodicity assumptions about the controlled Markov chain...

Estimation of the Entropy Rate of Ergodic Markov Chains

In this paper, an approximation of the entropy rate of an ergodic Markov chain is computed via sample-path simulation. Although an explicit form of the entropy rate exists, the exact computational method is laborious to apply. It is demonstrated that the entropy rate estimated from a sample path not only converges to the correct entropy rate but also does so exponential...
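
The sample-path estimator this abstract refers to can be sketched in a few lines. For a chain with transition matrix P and stationary distribution π, the entropy rate is H = −Σᵢ πᵢ Σⱼ Pᵢⱼ log Pᵢⱼ, and the average of −log P(Xₜ, Xₜ₊₁) along a simulated path converges to H by the ergodic theorem. A minimal sketch, with a hypothetical 3-state matrix (the numbers are illustrative, not from the paper):

```python
import numpy as np

# Hypothetical 3-state ergodic transition matrix (illustrative only).
P = np.array([[0.5, 0.3, 0.2],
              [0.1, 0.6, 0.3],
              [0.4, 0.2, 0.4]])

def exact_entropy_rate(P):
    # Stationary distribution pi: left eigenvector of P for eigenvalue 1.
    vals, vecs = np.linalg.eig(P.T)
    pi = np.real(vecs[:, np.argmax(np.real(vals))])
    pi /= pi.sum()
    # H = -sum_i pi_i sum_j P_ij log P_ij
    return -np.sum(pi[:, None] * P * np.log(P))

def sample_path_entropy_rate(P, n, seed=0):
    # Estimate H as the average of -log P(X_t, X_{t+1}) along one path.
    rng = np.random.default_rng(seed)
    x, log_sum = 0, 0.0
    for _ in range(n):
        y = rng.choice(len(P), p=P[x])
        log_sum += np.log(P[x, y])
        x = y
    return -log_sum / n
```

As n grows, `sample_path_entropy_rate(P, n)` approaches `exact_entropy_rate(P)`; the exponential convergence rate claimed in the abstract is the paper's contribution and is not demonstrated by this sketch.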

Uniform Recurrence Properties of Controlled Diffusions and Applications to Optimal Control

In this paper we address an open problem which was stated in [A. Arapostathis et al., SIAM J. Control Optim., 31 (1993), pp. 282–344] in the context of discrete-time controlled Markov chains with a compact action space. It asked whether the associated invariant probability distributions are necessarily tight if all stationary Markov policies are stable, in other words if the corresponding chain...

Time-Sharing Policies for Controlled Markov Chains

We propose a class of non-stationary policies called "policy time sharing" (p.t.s.), which possess several desirable properties for problems where the criteria are of the average-cost type: an optimal policy exists within this class, the computation of optimal policies is straightforward, and the implementation of this policy is easy. While in the finite-state case stationary policies are also kn...

Optimal Control of Ergodic Continuous-Time Markov Chains with Average Sample-Path Rewards

In this paper we study continuous-time Markov decision processes with the average sample-path reward (ASPR) criterion and possibly unbounded transition and reward rates. We propose conditions on the system's primitive data for the existence of ε-ASPR-optimal (deterministic) stationary policies in a class of randomized Markov policies satisfying some additional continuity assumptions. The proof o...


Journal:
  • SIAM J. Control and Optimization

Volume 50, Issue –

Pages –

Published 2012